3000 points are sampled randomly and uniformly from the surface of a torus. We wish to make a two-dimensional representation of these points such that local structure is preserved, i.e. points that are close together should stay close together. We colour the points to give a sense of where they belong.
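The post doesn't show how the sample was drawn, but a sketch of one common approach looks like this. The radii (R = 2, r = 1) and the rejection-sampling step are my assumptions; the key point is that uniform angles alone would over-sample the inner rim of the torus, so the tube angle must be corrected.

```r
# Sketch: uniform sampling from a torus surface via rejection sampling.
# R (major radius) and r (minor radius) are assumed values, not from the post.
set.seed(1)
R <- 2; r <- 1; n <- 3000

sample_torus <- function(n, R, r) {
  theta <- numeric(0)                       # angle around the tube
  while (length(theta) < n) {
    t_prop <- runif(n, 0, 2 * pi)
    # Surface area element is proportional to R + r*cos(theta), so accept
    # proposals with that probability to get a uniform sample on the surface.
    keep <- runif(n) < (R + r * cos(t_prop)) / (R + r)
    theta <- c(theta, t_prop[keep])
  }
  theta <- theta[1:n]
  phi <- runif(n, 0, 2 * pi)                # angle around the central axis
  data.frame(
    v1 = (R + r * cos(theta)) * cos(phi),
    v2 = (R + r * cos(theta)) * sin(phi),
    v3 = r * sin(theta)
  )
}

A <- sample_torus(n, R, r)
```

Every sampled point satisfies the torus equation, which makes the sampler easy to sanity-check.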

[Interactive 3D plot: the 3000 points sampled from the torus]

The Barnes-Hut t-SNE algorithm attempts to fit a model with fewer dimensions, preserving local structure. The lower-dimensional model looks a lot like an elastic torus was split open and flattened out, kind of like a popped torus-shaped balloon.
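A minimal sketch of such a fit, assuming the Rtsne package (the post doesn't name its implementation) and a random stand-in for the torus points:

```r
library(Rtsne)  # Barnes-Hut t-SNE for R (assumed implementation)

set.seed(1)
X <- matrix(rnorm(3000 * 3), ncol = 3)  # stand-in for the 3000 torus points

# theta > 0 selects the Barnes-Hut approximation; max_iter is kept small
# here just so the sketch runs quickly.
fit <- Rtsne(X, dims = 2, theta = 0.5, perplexity = 30, max_iter = 300)

head(fit$Y)  # 3000 x 2 matrix of embedding coordinates
```

`fit$Y` holds the low-dimensional coordinates that the interactive plots below display.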

[Interactive plot: the 2D t-SNE embedding of the torus]

A fourth dimension is added that has a linear relationship with another dimension (v3 on the plot). We simply add the colour mapping as a fourth dimension and see whether we recover the three-dimensional structure with the colour mapping intact. The columns of the 3000 x 4 matrix have been scaled.
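A sketch of this step, with stand-in data and an assumed slope and intercept (the post doesn't give the actual linear relationship it used):

```r
set.seed(1)
A <- data.frame(v1 = rnorm(10), v2 = rnorm(10), v3 = rnorm(10))  # stand-in points

A$v4 <- 2 * A$v3 + 1              # exact linear relationship with v3 (illustrative coefficients)

# scale() centres each column and divides by its standard deviation, so no
# single dimension dominates the pairwise distances that t-SNE works from.
A_scaled <- scale(as.matrix(A))
```

After scaling, every column has mean zero and unit standard deviation, which is what stops the added dimension from swamping the others.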

[Interactive 3D plot: v1, v2, v3, coloured by v4]

Again, this looks like a torus-shaped balloon that has burst, only in this case it hasn't been flattened out. Note the characteristic effect of t-SNE pulling clusters together; the perplexity hyperparameter controls the degree to which this occurs. The gradient descent minimises the Kullback-Leibler divergence, but the objective is non-convex, so the optimal solution isn't guaranteed. Nevertheless, this is a fair representation of the original four-dimensional object. If the matrix hadn't been scaled, the dimension represented by colour would have dominated under the Kullback-Leibler divergence, as we will see below.

[Interactive plot: t-SNE embedding of the unscaled four-dimensional data]

Sure, it’s pretty, but it doesn’t tell us much about v1, v2, or v3.

To increase the level of difficulty, all linear relationships are removed. Now the fourth dimension has a non-linear relationship with v3 and no correlation with v2 or v1.

[Interactive 3D plot: the non-linear relationship between v4 and v3]

[Interactive 3D plot]

Add dimension v5, which has a non-linear relationship with v1, v2, and v3, and a linear relationship with v4.

[Interactive 3D plot: v5 against the earlier dimensions]

Add dimension v6 which gets weird fast.

A$v6 <- (A$v2)^2 + sqrt((A$v5-A$v3)^2)  # equivalent to A$v2^2 + abs(A$v5 - A$v3)
with(A, plot3d(v2, v6, v5, type = "s", size = 1, col = cr[v4]))
with(A, plot3d(v3, v5, v6, type = "s", size = 1, col = cr[v4]))

[Interactive 3D plots: the two views of v6 plotted above]

We’re up to six dimensions, so we could feasibly explore them by looking at every combination of axes (here’s another representation by way of example).

[Interactive 3D plot: another combination of dimensions]

But for every dimension we add, the number of possible spatial representations grows as d choose 3. With 10 dimensions, that’s 120 plots (or (d - 1) choose 3 = 84 if we map one dimension to colour).
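These counts are easy to check with R's built-in binomial coefficient function:

```r
d <- 10
choose(d, 3)      # number of 3-axis plots for 10 dimensions: 120
choose(d - 1, 3)  # with one dimension mapped to colour: 84
```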

Setting perplexity = 750, the algorithm takes a long time to compute the embedding, but the result is superb. If you take some time to explore all of the plots above, you will see that those shapes are embedded in the dimension-reduced t-SNE plot.
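A hedged sketch of what this final run might look like, again assuming the Rtsne package and substituting a random stand-in for the scaled 3000 x 6 matrix built above:

```r
library(Rtsne)  # assumed implementation; the post doesn't name one

set.seed(1)
M <- scale(matrix(rnorm(3000 * 6), ncol = 6))  # stand-in for the scaled v1..v6 matrix

# A large perplexity means each point's similarity distribution spans many
# neighbours, which is slow to compute but produces a smoother global layout.
fit <- Rtsne(M, dims = 2, theta = 0.5, perplexity = 750)
```

Note that Rtsne requires 3 * perplexity to be less than the number of points minus one, so 750 is close to the upper limit for 3000 points.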

[Interactive plot: the final t-SNE embedding of the six-dimensional data]